Accelerated Doubly Stochastic Gradient Algorithm for Large-scale Empirical Risk Minimization

نویسندگان

Zebang Shen

Hui Qian

Tongzhou Mu

Chao Zhang

چکیده

Nowadays, algorithms with fast convergence, small memory footprints, and low per-iteration complexity are particularly favorable for artificial intelligence applications. In this paper, we propose a doubly stochastic algorithm with a novel accelerating multi-momentum technique to solve large scale empirical risk minimization problem for learning tasks. While enjoying a provably superior convergence rate, in each iteration, such algorithm only accesses a mini batch of samples and meanwhile updates a small block of variable coordinates, which substantially reduces the amount of memory reference when both the massive sample size and ultra-high dimensionality are involved. Specifically, to obtain an -accurate solution, our algorithm requires only O(log(1/ )/ √ ) overall computation for the general convex case and O((n + √ nκ) log(1/ )) for the strongly convex case. Empirical studies on huge scale datasets are conducted to illustrate the efficiency of our method in practice.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Doubly Accelerated Stochastic Variance Reduced Dual Averaging Method for Regularized Empirical Risk Minimization

In this paper, we develop a new accelerated stochastic gradient method for efficiently solving the convex regularized empirical risk minimization problem in mini-batch settings. The use of mini-batches is becoming a golden standard in the machine learning community, because mini-batch settings stabilize the gradient estimate and can easily make good use of parallel computing. The core of our pr...

متن کامل

SGD with Variance Reduction beyond Empirical Risk Minimization

We introduce a doubly stochastic proximal gradient algorithm for optimizing a finite average of smooth convex functions, whose gradients depend on numerically expensive expectations. Our main motivation is the acceleration of the optimization of the regularized Cox partial-likelihood (the core model used in survival analysis), but our algorithm can be used in different settings as well. The pro...

متن کامل

Doubly Stochastic Primal-Dual Coordinate Method for Regularized Empirical Risk Minimization with Factorized Data

We proposed a doubly stochastic primal-dual coordinate optimization algorithm for regularized empirical risk minimization that can be formulated as a saddlepoint problem. Different from existing coordinate methods, the proposed method randomly samples both primal and dual coordinates to update solutions, which is a desirable property when applied to data with both a high dimension and a large s...

متن کامل

Distributed Stochastic Variance Reduced Gradient Methods and A Lower Bound for Communication Complexity

We study distributed optimization algorithms for minimizing the average of convex functions. The applications include empirical risk minimization problems in statistical machine learning where the datasets are large and have to be stored on different machines. We design a distributed stochastic variance reduced gradient algorithm that, under certain conditions on the condition number, simultane...

متن کامل

Stochastic Three-Composite Convex Minimization

We propose a stochastic optimization method for the minimization of the sum of three convex functions, one of which has Lipschitz continuous gradient as well as restricted strong convexity. Our approach is most suitable in the setting where it is computationally advantageous to process smooth term in the decomposition with its stochastic gradient estimate and the other two functions separately ...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2017

Accelerated Doubly Stochastic Gradient Algorithm for Large-scale Empirical Risk Minimization

نویسندگان

چکیده

منابع مشابه

Doubly Accelerated Stochastic Variance Reduced Dual Averaging Method for Regularized Empirical Risk Minimization

SGD with Variance Reduction beyond Empirical Risk Minimization

Doubly Stochastic Primal-Dual Coordinate Method for Regularized Empirical Risk Minimization with Factorized Data

Distributed Stochastic Variance Reduced Gradient Methods and A Lower Bound for Communication Complexity

Stochastic Three-Composite Convex Minimization

عنوان ژورنال:

اشتراک گذاری